class: center, middle, inverse, title-slide

.title[
# Fixed Effects
]
.author[
### Christoph Hanck
]
.date[
### Summer 2023
]

---
layout: true

<a style="position: absolute;top:5px;left:10px;color:#004c93;" target="Overview" href="https://kaslides.netlify.app/">
</a>

---

# Fixed effects

<br>

**Motivation**

<br><br>

Fixed effects are often a remedy for two problems:

<br>

- Difficulty in figuring out the correct causal diagram that tells what to control for.
- Difficulty in measuring all the variables that need to be controlled for.

<br>

.blockquote[
### Definition: Fixed effects

Fixed effects is a method of controlling for all variables, whether they are observed or not, as long as they stay constant within some larger category.
]

---

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of getting electricity on the productivity of a rural town

- **Backdoor**: Geography (towns up in the mountainous hillside will be more difficult to electrify)
- The same town, observed at different times, will still be on the mountain
- By observing the town in different years, any variation explained by town can be removed
- Since there is no variation in geography that is not explained by town, controlling for town also controls for geography

→ There will be no variation in geography left!
]]

---

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of a visit from the German chancellor on a country’s level of trade with Germany

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#/Users/martin/git_projects/KA_slides/FixedEffects/renderthis_84cf515c5d90_files/figure-html/unnamed-chunk-2-1.png" alt="1: Causes of trade with Germany" width="55%" style="display:block; margin-right:auto; margin-left:auto;" />
<p class="caption">1: Causes of trade with Germany</p>
</div>
]]

---

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of a visit from the German chancellor on a country’s level of trade with Germany

- There are many causes of trade with Germany, many of which would be difficult to keep track of or even measure.
- Several of these variables (the country’s geography, the culture of the population, and the history that country has with Germany) are constant *within* country.
- The causal diagram can be redrawn...
]]

---

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of a visit from the German chancellor on a country’s level of trade with Germany

The effect can be identified *without* controlling for each of the variables; controlling only for country is enough.

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#/Users/martin/git_projects/KA_slides/FixedEffects/renderthis_84cf515c5d90_files/figure-html/unnamed-chunk-3-1.png" alt="2: Simplified causes of trade with Germany" width="55%" style="display:block; margin-right:auto; margin-left:auto; margin-top:30px" />
<p class="caption">2: Simplified causes of trade with Germany</p>
</div>
]]

---

# Fixed effects

<br>

- In fixed effects, **individuals** are controlled for. The **individual** can represent “person,” “company,” “school,” or “country,” and so on.
- The idea is to remove variation *between* individuals: by sweeping away all of that variation, any variables that are *fixed over time* are controlled for.
- Since fixed effects removes between variation, it is not a suitable approach when between variation is of interest!

**So what does fixed effects actually do to the data?**

We isolate **within variation** by finding the differences between groups (individuals) and subtracting them out.

---
exclude: true

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of exercise on the number of times a year a person gets a cold

Consider two people A and B. There are variables like “genetics” that are fixed over time and hence can be adjusted for with fixed effects.

|Individual|Year|Exercise|
|:------:|:------:|:------:|
|Person A|2019|5|
|Person A|2020|7|
|Person B|2019|4|
|Person B|2020|3|

<br>
]]

---
exclude: true

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of exercise on the number of times a year a person gets a cold

- To get the within and between variation, the within-individual means are required
- The difference between individuals in the means is the between variation

|Individual|Year|Exercise|Mean Exercise|Within Exercise|
|:------:|:------:|:------:|:------:|:------:|
|Person A|2019|5|6.0|-1.0|
|Person A|2020|7|6.0|1.0|
|Person B|2019|4|3.5|0.5|
|Person B|2020|3|3.5|-0.5|

<br>
]]

---
exclude: true

# Fixed effects

.vcenter[
.blockquote[
### Example: Effect of exercise on the number of times a year a person gets a cold

- Subtracting out individual means, we are left with variation from time period to time period, relative to the averages (the **within variation**)
- So person B's within variation in exercise compares the fact that in 2019 she exercised half an hour per week more than average, and in 2020 she exercised half an hour per week less than average
]]

---

# Fixed effects

.vcenter[
.blockquote[
### Example: Effects of healthy-eating reminders on actual healthy eating

- Four subjects have each downloaded an app that, at random times, reminds them to eat healthy
- They have chosen how frequently they want reminders, but do not control exactly on which days they come
- It is recorded how often they get reminders, and they are scored on how healthily they eat
- There are two kinds of variation here:
  - Variation over time in how intensively each individual gets reminders (treatment)
  - Variation in their healthy eating
]]

---

# Fixed effects

.blockquote[
### Example: Effects of healthy-eating reminders on actual healthy eating (ctd.)

There might be back doors that influence reminder frequency and also general healthy-eating level.
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#/Users/martin/git_projects/KA_slides/FixedEffects/renderthis_84cf515c5d90_files/figure-html/unnamed-chunk-4-1.png" alt="3: Healthy eating reminders and healthy eating for four individuals" width="65%" style="display:block; margin-right:auto; margin-left:auto;" />
<p class="caption">3: Healthy eating reminders and healthy eating for four individuals</p>
</div>
]

---

# Fixed effects

.blockquote[
### Example: Effects of healthy-eating reminders on actual healthy eating (ctd.)

Compute the mean number of reminders and mean healthy eating scores for each individual.

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#/Users/martin/git_projects/KA_slides/FixedEffects/renderthis_84cf515c5d90_files/figure-html/unnamed-chunk-5-1.png" alt="4: Healthy eating reminders and healthy eating for four individuals with individual means" width="65%" style="display:block; margin-right:auto; margin-left:auto;" />
<p class="caption">4: Healthy eating reminders and healthy eating for four individuals with individual means</p>
</div>
]

---

# Fixed effects

.vcenter[
.blockquote[
### Example: Effects of healthy-eating reminders on actual healthy eating (ctd.)

- The next task is to remove any variation *between* individuals
- All four crosses are slid on top of each other
- The data are centered at 0 on both the `\(X\)` and `\(Y\)` axes when plotted with the four individuals’ within variation alone
]]

---

# Fixed effects

.blockquote[
### Example: Effects of healthy-eating reminders on actual healthy eating (ctd.)

With the different individuals moved on top of each other, a clearer picture emerges, with a positive relationship between the intensity of reminders and the scores.
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#/Users/martin/git_projects/KA_slides/FixedEffects/renderthis_84cf515c5d90_files/figure-html/unnamed-chunk-6-1.png" alt="5: Healthy eating reminders and healthy eating for four individuals with individual means" width="60%" style="display:block; margin-right:auto; margin-left:auto;" />
<p class="caption">5: Healthy eating reminders and healthy eating for four individuals with individual means</p>
</div>
]

---

# Fixed effects

<br>

**Fixed effects can be performed with regression**

Fixed effects regressions generally start with a regression equation

`$$Y_{it}=\beta_i+\beta_1X_{it} +\epsilon_{it}$$`

- `\(X\)` has subscript `\(it\)`, indicating that the data vary both between individuals `\(i\)` and over time `\(t\)`
- The intercept term `\(\beta\)` has subscript `\(i\)`, indexing individuals

**Why do individual-specific intercepts yield fixed effect estimates where only within variation is used?**

Breaking up the intercept is equivalent to adding a *control for each person*.

---

# Fixed effects

<br>

**How to estimate a regression with individual intercepts?**

<br>

There are two common methods:

<br>

**Method 1:** extracting the within variation (absorbing the fixed effects).

1. Obtain `\(Y_{it}-\bar{Y}_i\)`, i.e., for each individual subtract the individual-specific mean `\(\bar{Y}_i\)` from the dependent variable `\(Y_{it}\)`
2. Do the same for the independent variables: `\(X_{it}-\bar{X}_i\)`.
3. Run the regression

`$$Y_{it}-\bar{Y}_i=\beta_0+\beta_1(X_{it}-\bar{X}_i)+\epsilon_{it}$$`

---

# Fixed effects

<br>

**How to estimate a regression with individual intercepts?**

**Method 2:** adding a binary control variable for the individuals

Run the regression

`$$Y_{it} = \beta_0 + \beta_{0,1}D_{1,i} + \dots +\beta_{0,N-1}D_{N-1,i} + \beta_1 X_{it} + \epsilon_{it}$$`

with `\(N\)` the number of individuals and `\(D_{j,i}\)` a binary variable that equals 1 if `\(i=j\)` and 0 otherwise.
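
As a minimal sketch of the two methods (an illustration under stated assumptions, not the course's own implementation), the following Python code computes the slope once from demeaned data and once with binary controls. The exercise hours are the values from the two-person exercise example in these slides; the outcome `colds`, the tuple layout, and all function names are hypothetical and chosen only for this illustration.

```python
# Toy panel. Exercise hours come from the two-person example in these slides;
# the outcome `colds` is hypothetical and only serves the illustration.
PANEL = [
    # (individual, year, exercise, colds)
    ("A", 2019, 5.0, 4.0),
    ("A", 2020, 7.0, 2.0),
    ("B", 2019, 4.0, 3.0),
    ("B", 2020, 3.0, 3.5),
]


def group_means(rows, col):
    """Return the mean of column `col` for each individual."""
    sums, counts = {}, {}
    for row in rows:
        sums[row[0]] = sums.get(row[0], 0.0) + row[col]
        counts[row[0]] = counts.get(row[0], 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def within_slope(rows):
    """Method 1: subtract individual means from X and Y, then run OLS.

    The demeaned variables have mean zero, so the intercept is zero and the
    slope reduces to sum(x*y) / sum(x*x).
    """
    x_bar, y_bar = group_means(rows, 2), group_means(rows, 3)
    x_w = [row[2] - x_bar[row[0]] for row in rows]
    y_w = [row[3] - y_bar[row[0]] for row in rows]
    return sum(x * y for x, y in zip(x_w, y_w)) / sum(x * x for x in x_w)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Partial pivoting: move the largest remaining entry of column c up.
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                factor = m[r][c] / m[c][c]
                m[r] = [v - factor * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dummy_slope(rows):
    """Method 2: OLS of Y on an intercept, N-1 individual dummies, and X."""
    ids = sorted({row[0] for row in rows})
    # Leave out the dummy of the last individual to avoid perfect collinearity.
    design = [
        [1.0] + [1.0 if row[0] == i else 0.0 for i in ids[:-1]] + [row[2]]
        for row in rows
    ]
    y = [row[3] for row in rows]
    k = len(design[0])
    # Normal equations (X'X) b = X'y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * v for d, v in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)[-1]  # the coefficient on X is the last one


if __name__ == "__main__":
    print("within slope:", within_slope(PANEL))
    print("dummy slope: ", dummy_slope(PANEL))
```

On this toy panel both functions return the same slope (up to floating-point error), which illustrates that the two methods are equivalent ways of using only the within variation.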

**Caution**

- If the model includes an intercept, then including a binary variable for every individual will make the model impossible to estimate, so leave one out, or leave the intercept out (we leave out `\(\beta_{0,N}\)` here)!
- The individual without its own intercept is still in the analysis, but it does not get its own coefficient: estimates are interpreted *relative* to that individual.

---

# Fixed effects

<br>

**How to interpret the results of this regression once it is estimated?**

<br>

- The interpretation of the slope coefficient in a fixed effects model is the same as when controlling for any other variable
- The value of `\(R^2\)` gives an insight into the interpretation. `\(R^2\)` is a measure based on how much variation there is in the residuals relative to the dependent variable. The value of `\(R^2\)` lies between 0 and 1, where 1 means that the fitted regression explains the data well and 0 means it does not
- Sometimes researchers will use the fixed individual effects estimates to make claims about the individuals being evaluated

---

# Fixed effects

<br>

**Some notes on fixed effects**

<br>

- Fixed effects is a good way of controlling for a long list of unobserved variables that are fixed over time when we are interested in the effect of (a few) time-varying variables
- Those individual effects do not have the same luxury of controlling for a bunch of unobserved *time-varying* variables
- The individual fixed effects estimates themselves *cannot* be thought of as causal!
- Fixed effects focuses on *within* variation. So the treatment effect estimated will focus a lot more heavily on people with a lot of within variation.

---

# Fixed effects

<br>

**Why not include multiple sets of fixed effects?**

<br>

Suppose the same individuals are observed over multiple years.
We could then include

- fixed effects for individuals, *comparing individuals to themselves at different periods* of time
- fixed effects for year, *comparing individuals to each other* at the same period of time

Including both is commonly known as **two-way fixed effects**,

`$$Y_{it}=\beta_i+\beta_t+\beta_1X_{it}+\epsilon_{it}.$$`

The variation that is left is variation relative to what is expected given that individual and given that year.

---

# Fixed effects

.blockquote[
### Example: The effect of in-job training on pay at that job

- Data are available on how much each person earns per year, `\(Y_{it}\)`, and on how many hours of training they received that year, `\(X_{it}\)`. Overall earnings in 2009 were $10,000 below what they normally are (due to the recession)
- Anthony earns $120,000 per year, a lot more money than the average person ($40,000). In 2009 Anthony earned only $116,000, which is above average, but that can be explained by the fact that Anthony earns a lot.
- Given that the individual is Anthony, $116,000 is $4,000 below average.
- But given that the year is 2009, most people are earning $10,000 less, while Anthony is earning only $4,000 less.
- So given that it is Anthony, and given that it is 2009, Anthony's earnings in 2009 are $6,000 more than expected.
]

---

# Fixed effects

<br>

**Estimating a model with multiple fixed effects**

<br>

- ... can be done using the same **binary controls** approach as for the regular (one-way) fixed effects estimator with only one set of fixed effects
- However, this can be difficult if there are lots of different fixed effects to estimate.
- Unless the fixed effects only have a few categories each, it is generally advised to use a method specifically designed to handle multiple sets of fixed effects.
- These methods usually use something called **alternating projections.**

---

# Random effects

<br>

**What it is**

- Instead of letting the `\(\beta_i\)`s be anything, random effects puts some structure on them
- Random effects assumes that the `\(\beta_i\)` come from a known distribution, for example a normal distribution
- The idea is to use a weighted average of within and between variation (instead of just the within variation)

**Why it can be useful**

<br>

- More precise estimates (lower standard errors)
- Improved estimation of the individual `\(\beta_i\)` effects
- Random effects solves the same back door problem that fixed effects does if the `\(\beta_i\)` terms are *unrelated* to the right-hand-side variables `\(X_{it}\)`

---

# Fixed effects vs. random effects

<br>

**Comparison**

<br><br>

- Fixed effects are used to *solve a research design problem*.
- Random effects only solves the *same problem* if the individual effects are *unrelated to the right-hand-side variables*, including our treatment variable. That seems unlikely!
- For this reason, fixed effects is almost always preferred to random effects
- One common way around this problem is to use the **Durbin-Wu-Hausman test**

---

# Fixed effects vs. random effects

<br><br>

.blockquote[
### Definition: Durbin-Wu-Hausman test

The Durbin-Wu-Hausman test is a broad set of tests that compare the estimates in one model against the estimates in another and see if they are different.
]

<br>

**The idea of the test**

If the estimates of fixed effects and random effects are found to *not* be different (i.e., the test *fails* to reject the null hypothesis), then the relationship between `\(\beta_i\)` and `\(X_{it}\)` probably is not too strong.

→ Random effects can be used.

---

# Fixed effects

<br>

**Clustered standard errors**

<br>

**Motivation**

- An assumption of the regression model is that the errors are independent.
Then the parts of the data-generating process (DGP) for the outcome variable that are not in the model are effectively random. This assumption is *necessary to obtain correct regression standard errors*!
- However, this assumption is difficult to maintain when it comes to fixed effects:
  - If parts of the DGP that are left out are shared across all of an individual's or a group’s observations, the errors will be correlated.

→ **Classic standard errors will be wrong**.

- For this reason **clustered standard errors** are used

---

# Clustered standard errors

.vcenter[
.blockquote[
### Definition: Clustered standard errors

Clustered standard errors calculate the standard errors while allowing some level of correlation between the error terms within a cluster. With fixed effects, standard errors are typically clustered at the same level as the fixed effect.

<br>

**Examples**

- For person fixed effects, using standard errors clustered at the person level
- For city fixed effects, using standard errors clustered at the city level
]]

---

# Clustered standard errors

<br>

**Conditions**

<br>

- Clustered standard errors are necessary when there is **treatment effect heterogeneity.** That is, the treatment effect must be *different for different individuals*.
- Also, either the fixed effect groups/individuals in the data need to be a non-random sample of the population, or, within fixed effect groups/individuals, the treatment variable is assigned in a clustered way.

---

# Fixed effects in nonlinear models

<br>

**Why might this be a problem?**

<br>

Recall that adding a set of binary control variables is a method to get rid of between variation.

- That is fine in a linear model, but in a nonlinear model we might run into the **incidental parameters problem**, i.e., the estimator cannot cope well with estimating that many group-specific parameters

→ The estimates start getting bad.

---

# Fixed effects in nonlinear models

<br>

**Why might this be a problem?**

<br>

- The other method for getting rid of between variation is subtracting out the within-group means.
- This no longer works because the outcome variable is not a continuous variable any more: the mean is no longer a good representation of _what needs to be subtracted out to get rid of the between variation._

---

# Fixed effects in nonlinear models

<br>

**What to do then?**

<br>

There are two common approaches to using a fixed effects design when there is a nonlinear dependent variable:

1. Drop the *nonlinear* part and estimate a *linear* model anyway with the nonlinear dependent variable. This is especially common when the dependent variable is binary or ordinal
2. Use a nonlinear model variant that is designed to handle fixed effects properly

(Examples will be discussed in the R tutorials.)

---

# Advanced random effects

<br>

**Hierarchical modeling**

- Modern approaches to random effects consider models that contain *multiple levels*
- This seems natural: in any fixed effects setting, the data is **hierarchical:** at the very least we have multiple observations over time within individuals.

E.g., if the model is

`$$Y_{it}=\beta_i+\beta_1X_{it}+\epsilon_{it}$$`

it can be said that `\(\beta_i\)` has its own model equation,

`$$\beta_i=\beta_0+\gamma_1Z_i+\mu_i$$`

where `\(Z_i\)` is a set of individual-level variables that determine the individual effect and `\(\mu_i\)` is an error.

---

# Advanced random effects

<br>

**Hierarchical modeling**

- This allows us to deal with the correlation between the `\(\beta_i\)` and the `\(X_{it}\)` because it is *modelled directly*
- The approach has the nice statistical properties that random effects gives over fixed effects, but *without imposing as many additional assumptions*.
- Hierarchical modelling allows us to look at the effect of variables that are constant within individual
- It also allows us to explicitly analyse between and within effects separately

---

# Advanced random effects

<br>

**Hierarchical modeling**

There are two directions to go with this multi-level intuition.

<br>

**Direction 1**

- Calculate within-individual means for each predictor `\(\bar{X}_i\)`
- Obtain the within variation for each predictor `\((X_{it}-\bar{X}_i)\)`
- Regress the dependent variable on the within-individual means `\(\bar{X}_i\)`, the within variation `\((X_{it}-\bar{X}_i)\)` and any other individual-level controls `\(Z_i\)`, and use random effects for the intercept
- This is referred to as **correlated random effects**

---

# Advanced random effects

<br>

**Hierarchical modeling**

<br>

**Direction 2**

- The other way to take the multi-level intuition is to go whole hog into mixed models, also known as **hierarchical linear models.**
- In mixed models, there is no need to stop at the intercept following a random distribution
- The resulting models can get complex, but they can be used to represent very complex interactions between variables:

`$$Y_{it}=\beta_{0i}+\beta_{1i}X_{it}+\epsilon_{it}$$`

`$$\beta_{0i}=\beta_0+\gamma_1Z_i+\mu_i$$`

`$$\beta_{1i}=\beta_1+\delta_1Z_i+\eta_i$$`
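
To see where the interactions come from, the two second-level equations can be substituted into the first one. This is a worked step added for illustration; it uses only the symbols defined in the three equations of this slide:

`$$Y_{it}=\beta_0+\gamma_1Z_i+\beta_1X_{it}+\delta_1Z_iX_{it}+\left(\mu_i+\eta_iX_{it}+\epsilon_{it}\right)$$`

The term `\(\delta_1Z_iX_{it}\)` is an interaction between the individual-level variable and the time-varying regressor, and the composite error in parentheses contains the random intercept `\(\mu_i\)` and the random slope `\(\eta_i\)`.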